Abbreviation Recognition with MaxEnt Model
Abstract
Abbreviated words carry critical information in the literature of many specialized domains. This paper reports our research on recognizing dotted abbreviations with a maximum entropy (MaxEnt) model. The key points of our work are: (1) allowing the model to optimize over as many features as possible to capture the textual characteristics of context words, and (2) using simple lexical information, such as sentence-initial words and candidate word length, to improve performance. Experimental results show that this approach achieves strong performance on the WSJ corpus.
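The abstract names three kinds of evidence: context words, sentence-initial words, and candidate word length. As a rough illustration only, the sketch below shows how such cues might be turned into a sparse feature dictionary for one dotted candidate token. Everything in it is an assumption rather than the authors' method: the function name `candidate_features`, the window size, the feature naming scheme, and the tiny `SENTENCE_INITIAL` lexicon are all hypothetical.

```python
from typing import Dict, List

# Hypothetical mini-lexicon of words that often start a sentence (assumption,
# not taken from the paper).
SENTENCE_INITIAL = {"the", "he", "it", "in"}


def candidate_features(tokens: List[str], i: int, window: int = 2) -> Dict[str, float]:
    """Build a sparse feature dict for the dotted candidate tokens[i]."""
    word = tokens[i]
    following = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
    feats: Dict[str, float] = {
        "bias": 1.0,
        # Candidate word length, not counting the trailing period.
        "len=" + str(len(word.rstrip("."))): 1.0,
        # Lexical cue: is the following word a frequent sentence opener?
        "next_is_sentence_initial": 1.0 if following in SENTENCE_INITIAL else 0.0,
    }
    # Context words within the window, keyed by relative position.
    for offset in range(-window, window + 1):
        j = i + offset
        if offset != 0 and 0 <= j < len(tokens):
            feats[f"ctx[{offset:+d}]={tokens[j].lower()}"] = 1.0
    return feats


tokens = ["Mr.", "Smith", "joined", "Acme", "Corp.", "in", "May", "."]
features = candidate_features(tokens, 4)  # features for the candidate "Corp."
```

Feature dictionaries of this shape could then be passed to any multinomial logistic regression trainer, which is the usual way a MaxEnt classifier is fitted.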
Similar resources
A hybrid Maxent/HMM based ASR system
The aim of this work is to develop a practical framework, which extends the classical Hidden Markov Models (HMM) for continuous speech recognition based on the Maximum Entropy (MaxEnt) principle. The MaxEnt models can estimate the posterior probabilities directly as with Hybrid NN/HMM connectionist speech recognition systems. In particular, a new acoustic modelling based on discriminative MaxEn...
A Discriminative Alignment Model for Abbreviation Recognition
This paper presents a discriminative alignment model for extracting abbreviations and their full forms appearing in actual text. The task of abbreviation recognition is formalized as a sequential alignment problem, which finds the optimal alignment (origins of abbreviation letters) between two strings (abbreviation and full form). We design a large number of fine-grained features that directly e...
Using continuous features in the maximum entropy model
We investigate the problem of using continuous features in the maximum entropy (MaxEnt) model. We explain why the MaxEnt model with the moment constraint (MaxEnt-MC) works well with binary features but not with the continuous features. We describe how to enhance constraints on the continuous features and show that the weights associated with the continuous features should be continuous function...
A Hybrid Oriya Named Entity Recognition system: Harnessing the Power of Rule
This paper describes a hybrid system that applies a maximum entropy (MaxEnt) model together with a hidden Markov model (HMM) and some linguistic rules to recognize named entities in the Oriya language. The main advantage of our system is that it uses both the HMM and the MaxEnt model successively, along with some manually developed linguistic rules. First, we use MaxEnt to identify named entities in the Oriya corpus, then tagg...